 molecular subtype


Radiological and Biological Dictionary of Radiomics Features: Addressing Understandable AI Issues in Personalized Breast Cancer; Dictionary Version BM1.0

Gorji, Arman, Sanati, Nima, Pouria, Amir Hossein, Mehrnia, Somayeh Sadat, Hacihaliloglu, Ilker, Rahmim, Arman, Salmanpour, Mohammad R.

arXiv.org Artificial Intelligence

Radiomics-based AI models show promise for breast cancer diagnosis but often lack interpretability, limiting clinical adoption. This study addresses the gap between radiomic features (RF) and the standardized BI-RADS lexicon by proposing a dual-dictionary framework. First, a Clinically-Informed Feature Interpretation Dictionary (CIFID) was created by mapping 56 RFs to BI-RADS descriptors (shape, margin, internal enhancement) through literature and expert review. The framework was applied to classify triple-negative breast cancer (TNBC) versus non-TNBC using dynamic contrast-enhanced MRI from a multi-institutional cohort of 1,549 patients. We trained 27 machine learning classifiers with 27 feature selection methods. SHapley Additive exPlanations (SHAP) were used to interpret predictions and generate a complementary Data-Driven Feature Interpretation Dictionary (DDFID) for 52 additional RFs. The best model, combining Variance Inflation Factor (VIF) selection with Extra Trees Classifier, achieved an average cross-validation accuracy of 0.83. Key predictive RFs aligned with clinical knowledge: higher Sphericity (round/oval shape) and lower Busyness (more homogeneous enhancement) were associated with TNBC. The framework confirmed known imaging biomarkers and uncovered novel, interpretable associations. This dual-dictionary approach (BM1.0) enhances AI model transparency and supports the integration of RFs into routine breast cancer diagnosis and personalized care.
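To make the Variance Inflation Factor (VIF) selection step mentioned above more concrete, here is a minimal sketch, not the authors' pipeline. It computes each feature's VIF as the corresponding diagonal entry of the inverse correlation matrix and repeatedly drops the feature with the highest VIF. The function names, the toy feature values, and the threshold of 10 are all assumptions for illustration; the Extra Trees classifier and SHAP steps are omitted.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """Return {feature name: VIF}; VIF_j is the j-th diagonal entry of the
    inverse correlation matrix, which equals 1 / (1 - R_j^2)."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}

def select_low_vif(columns, threshold=10.0):
    """Drop the highest-VIF feature until every remaining VIF is <= threshold."""
    kept = dict(columns)
    while len(kept) > 1:
        scores = vif(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del kept[worst]
    return list(kept)

# Toy values (hypothetical): "sphericity_dup" is a near-copy of "sphericity".
toy_features = {
    "sphericity": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "busyness": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "sphericity_dup": [1.1, 1.9, 3.0, 4.1, 4.9, 6.0],
}
print(select_low_vif(toy_features))
```

In this toy table the two near-collinear columns receive very large VIF values, so one of them is removed and the remaining pair falls under the threshold.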


TSEML: A task-specific embedding-based method for few-shot classification of cancer molecular subtypes

Su, Ran, Shi, Rui, Cui, Hui, Xuan, Ping, Fang, Chengyan, Feng, Xikang, Jin, Qiangguo

arXiv.org Artificial Intelligence

Molecular subtyping of cancer is recognized as a critical and challenging upstream task for personalized therapy. Existing deep learning methods have achieved significant performance in this domain when abundant data samples are available. However, the acquisition of densely labeled samples for cancer molecular subtypes remains a significant challenge for conventional data-intensive deep learning approaches. In this work, we focus on the few-shot molecular subtype prediction problem in heterogeneous and small cancer datasets, aiming to enhance precise diagnosis and personalized treatment. We first construct a new few-shot dataset for cancer molecular subtype classification and auxiliary cancer classification, named TCGA Few-Shot, from existing publicly available datasets. To effectively leverage the relevant knowledge from both tasks, we introduce a task-specific embedding-based meta-learning framework (TSEML). TSEML leverages the synergistic strengths of a model-agnostic meta-learning (MAML) approach and a prototypical network (ProtoNet) to capture diverse and fine-grained features. Comparative experiments conducted on the TCGA Few-Shot dataset demonstrate that our TSEML framework achieves superior performance in addressing the problem of few-shot molecular subtype classification.
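The prototypical-network component named above can be illustrated with a minimal sketch of nearest-prototype classification. This is not the TSEML framework: the embedding network and the MAML adaptation are omitted, and the embeddings, subtype labels, and function names below are hypothetical. Each class prototype is the mean of its support embeddings, and a query is assigned to the class whose prototype is closest in Euclidean distance.

```python
import math
from collections import defaultdict

def class_prototypes(support):
    """support: list of (embedding, label) pairs.
    Returns {label: mean embedding of that label's support samples}."""
    grouped = defaultdict(list)
    for embedding, label in support:
        grouped[label].append(embedding)
    return {
        label: [sum(dim) / len(embs) for dim in zip(*embs)]
        for label, embs in grouped.items()
    }

def classify(query, prototypes):
    """Assign the query embedding to the label of the nearest prototype."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# A 2-way, 2-shot episode with made-up 2-D embeddings.
support_set = [
    ([0.0, 0.0], "LumA"),
    ([0.2, 0.0], "LumA"),
    ([1.0, 1.0], "Basal"),
    ([1.2, 0.8], "Basal"),
]
prototypes = class_prototypes(support_set)
print(classify([0.9, 1.0], prototypes))
```

Because only class means are stored, a new episode with different classes needs no retraining of this step, which is one reason prototype-based methods suit few-shot settings.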


Deep learning-based classification of breast cancer molecular subtypes from H&E whole-slide images

Tafavvoghi, Masoud, Sildnes, Anders, Rakaee, Mehrdad, Shvetsov, Nikita, Bongo, Lars Ailo, Busund, Lill-Tove Rasmussen, Møllersen, Kajsa

arXiv.org Artificial Intelligence

Classifying breast cancer molecular subtypes is crucial for tailoring treatment strategies. While immunohistochemistry (IHC) and gene expression profiling are standard methods for molecular subtyping, IHC can be subjective, and gene profiling is costly and not widely accessible in many regions. Previous approaches have highlighted the potential application of deep learning models on H&E-stained whole slide images (WSI) for molecular subtyping, but these efforts vary in their methods, datasets, and reported performance. In this work, we investigated whether H&E-stained WSIs alone could be leveraged to predict breast cancer molecular subtypes (luminal A, luminal B, HER2-enriched, and Basal). We used 1,433 WSIs of breast cancer in a two-step pipeline: first, classifying tumor and non-tumor tiles so that only the tumor regions are used for molecular subtyping; and second, employing a One-vs-Rest (OvR) strategy to train four binary OvR classifiers and aggregating their results using an eXtreme Gradient Boosting (XGBoost) model. The pipeline was tested on 221 hold-out WSIs, achieving an overall macro F1 score of 0.95 for tumor detection and 0.73 for molecular subtyping. Our findings suggest that, with further validation, supervised deep learning models could serve as supportive tools for molecular subtyping in breast cancer. Our code is made available to facilitate ongoing research and development.
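The One-vs-Rest aggregation and the macro F1 metric described above can be sketched as follows. This is a simplified illustration, not the authors' code: a plain argmax over the four binary scores stands in for the XGBoost aggregator, and the label names, scores, and function names are assumptions.

```python
SUBTYPES = ["LumA", "LumB", "HER2", "Basal"]

def aggregate_ovr(scores):
    """scores: {subtype: score from that subtype's binary OvR classifier}.
    Picks the subtype with the highest score (argmax, used here in place
    of a learned aggregator such as XGBoost)."""
    return max(SUBTYPES, key=lambda subtype: scores[subtype])

def macro_f1(y_true, y_pred, labels=SUBTYPES):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes an F1 of 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical OvR scores for one slide.
slide_scores = {"LumA": 0.30, "LumB": 0.10, "HER2": 0.15, "Basal": 0.80}
print(aggregate_ovr(slide_scores))

# Hypothetical labels for four slides; one LumB slide is misread as LumA.
truth = ["LumA", "LumB", "HER2", "Basal"]
preds = ["LumA", "LumA", "HER2", "Basal"]
print(macro_f1(truth, preds))
```

Macro averaging weights every subtype equally, so a rare subtype that is classified poorly lowers the score as much as a common one would.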


UnPaSt: unsupervised patient stratification by differentially expressed biclusters in omics data

Hartung, Michael, Maier, Andreas, Delgado-Chaves, Fernando, Burankova, Yuliya, Isaeva, Olga I., Patroni, Fábio Malta de Sá, He, Daniel, Shannon, Casey, Kaufmann, Katharina, Lohmann, Jens, Savchik, Alexey, Hartebrodt, Anne, Chervontseva, Zoe, Firoozbakht, Farzaneh, Probul, Niklas, Zotova, Evgenia, Tsoy, Olga, Blumenthal, David B., Ester, Martin, Laske, Tanja, Baumbach, Jan, Zolotareva, Olga

arXiv.org Artificial Intelligence

Most complex diseases, including cancer and non-malignant diseases like asthma, have distinct molecular subtypes that require distinct clinical approaches. However, existing computational patient stratification methods have been benchmarked almost exclusively on cancer omics data and only perform well when mutually exclusive subtypes can be characterized by many biomarkers. Here, we contribute a large-scale evaluation, quantitatively exploring the power of 22 unsupervised patient stratification methods using both simulated and real transcriptome data. Building on this experience, we developed UnPaSt (https://apps.cosy.bio/unpast/), which optimizes unsupervised patient stratification and works even with only a limited number of subtype-predictive biomarkers. We evaluated all 23 methods on real-world breast cancer and asthma transcriptomics data. Although many methods reliably detected major breast cancer subtypes, only a few identified Th2-high asthma, and UnPaSt significantly outperformed its closest competitors in both test datasets. Essentially, we showed that UnPaSt can detect many biologically insightful and reproducible patterns in omics datasets.


Classification of Luminal Subtypes in Full Mammogram Images Using Transfer Learning

Panambur, Adarsh Bhandary, Madhu, Prathmesh, Maier, Andreas

arXiv.org Artificial Intelligence

Automatic identification of patients with luminal and non-luminal subtypes during routine mammography screening can support clinicians in streamlining breast cancer therapy planning. Recent machine learning techniques have shown promising results in molecular subtype classification in mammography; however, they depend heavily on pixel-level annotations and on handcrafted and radiomic features. In this work, we provide initial insights into luminal subtype classification in full mammogram images, with models trained using only image-level labels. Transfer learning is applied from a breast abnormality classification task to fine-tune a ResNet-18-based classifier for the luminal versus non-luminal subtype classification task. We present and compare our results on the publicly available CMMD dataset and show that our approach significantly outperforms the baseline classifier, achieving a mean AUC score of 0.6688 and a mean F1 score of 0.6693 on the test dataset. The improvement over the baseline is statistically significant (p < 0.0001).
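For readers unfamiliar with the AUC metric reported above, the following minimal sketch computes ROC AUC for a binary luminal (1) versus non-luminal (0) task as the fraction of positive/negative pairs that the model ranks correctly, counting ties as half. The labels, scores, and function name are hypothetical and do not come from the paper's experiments.

```python
def roc_auc(y_true, scores):
    """ROC AUC via pairwise comparison: the probability that a randomly
    chosen positive sample scores higher than a randomly chosen negative."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical image-level labels (1 = luminal) and model scores.
labels = [1, 1, 0, 0]
model_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, model_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for a mean AUC in the high 0.6 range.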


Attention-based Interpretable Regression of Gene Expression in Histology

Graziani, Mara, Marini, Niccolò, Deutschmann, Nicolas, Janakarajan, Nikita, Müller, Henning, Martínez, María Rodríguez

arXiv.org Artificial Intelligence

Interpretability of deep learning is widely used to evaluate the reliability of medical imaging models and reduce the risks of inaccurate patient recommendations. For models exceeding human performance, e.g. predicting RNA structure from microscopy images, interpretable modelling can be further used to uncover highly non-trivial patterns which are otherwise imperceptible to the human eye. We show that interpretability can reveal connections between the microscopic appearance of cancer tissue and its gene expression profile. While exhaustive profiling of all genes from histology images is still challenging, we estimate the expression values of a well-known subset of genes that is indicative of cancer molecular subtype, survival, and treatment response in colorectal cancer. Our approach successfully identifies meaningful information from the image slides, highlighting hotspots of high gene expression. Our method can help characterise how gene expression shapes tissue morphology, and this may be beneficial for patient stratification in the pathology unit. The code is available on GitHub.


A Deeper Understanding of Breast Cancer

#artificialintelligence

The same technology that powers Siri and face recognition on your iPhone has also found success in medicine. By automatically analyzing microscopic images of breast tumor biopsies, artificial intelligence may one day help guide cancer treatments. This particular type of AI is called deep learning, and over the last few years has become a part of our everyday lives. Its applications continue to expand to areas like language translation and self-driving cars, enabled by massive repositories of data. While deep learning was first applied to recognizing people, cars, and other everyday objects in photographs, it has more recently been adapted to study cancer.


Cancer: A Computational Disease that AI Can Cure

Tenenbaum, Jay M. (CommerceNet) | Shrager, Jeff (CollabRx)

AI Magazine

Cancer kills millions of people each year. From an AI perspective, finding effective treatments for cancer is a high-dimensional search problem characterized by many molecularly distinct cancer subtypes, many potential targets and drug combinations, and a dearth of high-quality data to connect molecular subtypes and treatments to responses. The broadening availability of molecular diagnostics and electronic medical records presents both opportunities and challenges for applying AI techniques to personalize and improve cancer treatment. We discuss these in the context of Cancer Commons, a “rapid learning” community where patients, physicians, and researchers collect and analyze the molecular and clinical data from every cancer patient, and use these results to individualize therapies. Research opportunities include adaptively planning and executing individual treatment experiments across the whole patient population, inferring the causal mechanisms of tumors, predicting drug response in individuals, and generalizing these findings to new cases. The goal is to treat each patient in accord with the best available knowledge, and to continually update that knowledge to benefit subsequent patients. Achieving this goal is a worthy grand challenge for AI.